-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements, up to 20x faster #183
Conversation
Excellent. |
What I wonder is whether some of this can be backported to the string-only filepath or whether the new API introduced performance regressions (a few functions were slightly rewritten). |
I get compilation error on 9.2.6:
|
The changes improve performance both for the legacy API and for the new one.
It's a warning for me, and present in the |
-- then break filename into basename and extension, then recombine dir and basename. | ||
-- This is way too expensive, see @splitFileName_@ comment for discussion. | ||
-- | ||
-- Instead we speculatively split on the extension separator first, then check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice trick
If you rebase against master and then push to this repo instead of your fork, we can benefit from the fixed CI. |
-- If bs' is empty, then s2 as the last character of dirSlash must be a path separator, | ||
-- so we are in the middle of shared drive. | ||
-- Otherwise, since s1 is a path separator, we might be in the middle of UNC path. | ||
, null bs' || maybe False (null . snd) (readDriveUNC dirSlash) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be honest... this code will be harder to maintain in the face of fixing bugs like this:
ghci> W.splitFileName "\\\\.\\C:"
("\\\\.\\","C:")
These will be eventually fixed when I get around finishing #99
So I can't guarantee there won't be performance regressions in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, I'll be happy to take another look at performance after #99.
FWIW this branch deals with Windows file names starting with \\
, so it's a relatively uncommon code path, should not be a bottleneck anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with merging. If I ever manage to finish #99, then it will be a major bump and almost none of the existing code will remain.
The idea is to parse into an ADT and then deal with that only. My guess is that will increase allocations, so we'll see performance regression.
@hasufell any more comments / suggestions? |
@Bodigrim we have a regression: With this patch (1.4.100.2):
With 1.4.100.0:
This was by chance found by a property test (they are rather unreliable): https://github.com/haskell/filepath/actions/runs/4263893204/jobs/7421218609#step:5:1112 However, |
Resolves #23741 Metric Decrease: MultiComponentModules MultiComponentModulesRecomp MultiLayerModules MultiLayerModulesRecomp T10421 T12234 T12425 T13035 T13701 T13719 T16875 T18304 T18698a T18698b T21839c T9198 TcPlugin_RewritePerf hard_hole_fits Metric decrease on Windows can be probably attributed to haskell/filepath#183
Resolves #23741 Metric Decrease: MultiComponentModules MultiComponentModulesRecomp MultiLayerModules MultiLayerModulesRecomp T10421 T12234 T12425 T13035 T13701 T13719 T16875 T18304 T18698a T18698b T21839c T9198 TcPlugin_RewritePerf hard_hole_fits Metric decrease on Windows can be probably attributed to haskell/filepath#183 (cherry picked from commit e9a0fa3)
@hasufell That regression was fixed in 6a1c98d, correct? I noticed at least one other change in behavior which I discovered by chance in the $ ghci-9.4.8
GHCi, version 9.4.8: https://www.haskell.org/ghc/ :? for help
ghci> import qualified System.FilePath.Windows as W
ghci> W.takeDirectory "//?/a:"
"//?/a:"
$ ghci-9.8.1
GHCi, version 9.8.1: https://www.haskell.org/ghc/ :? for help
ghci> W.takeDirectory "//?/a:"
"//?/" Can you comment whether this is also to be considered a regression? |
|
In one of my projects I've been trying to migrate from home-grown low-level helpers to
filepath
API (e. g.,takeExtension fn == ".foo"
instead of".foo" `B.isSuffixOf` fn
) but failed miserably: performance penalties were too harsh, especially on Windows.I fixed some low-hanging fruits, and I think results are quite good: many functions are twice faster now, and
System.OsPath
on Windows is often 10x-20x faster. Here is a detailed report: